Algorithms and Adaptivity Gaps for Stochastic k-TSP
Given a metric $(V,d)$ and a $\textsf{root} \in V$, the classic
\textsf{k-TSP} problem is to find a tour originating at the $\textsf{root}$
of minimum length that visits at least $k$ nodes in $V$. In this work,
motivated by applications where the input to an optimization problem is
uncertain, we study two stochastic versions of \textsf{k-TSP}.
In Stoch-Reward $k$-TSP, originally defined by Ene-Nagarajan-Saket [ENS17],
each vertex $v$ in the given metric $(V,d)$ contains a stochastic reward $R_v$.
The goal is to adaptively find a tour of minimum expected length that collects
at least reward $k$; here "adaptively" means our next decision may depend on
previous outcomes. Ene et al. give an $O(\log k)$-approximation adaptive
algorithm for this problem, and leave open whether there is an
$O(1)$-approximation algorithm. We resolve their open question and even give an
$O(1)$-approximation \emph{non-adaptive} algorithm for this problem.
We also introduce and obtain similar results for the Stoch-Cost $k$-TSP
problem. In this problem each vertex $v$ has a stochastic cost $C_v$, and the
goal is to visit and select at least $k$ vertices to minimize the expected
\emph{sum} of tour length and cost of selected vertices. This problem
generalizes the Price of Information framework [Singla18] from deterministic
probing costs to metric probing costs.
Our techniques are based on two crucial ideas: "repetitions" and "critical
scaling". We show using Freedman's and Jogdeo-Samuels' inequalities that for
our problems, if we truncate the random variables at an ideal threshold and
repeat, then their expected values form a good surrogate. Unfortunately, this
ideal threshold is adaptive as it depends on how far we are from achieving our
target $k$, so we truncate at various different scales and identify a
"critical" scale.
Comment: ITCS 202
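The truncation idea in the abstract above can be illustrated with a small sketch. It computes the truncated expectation $E[\min(X, \tau)]$ of a discrete reward at several thresholds $\tau$. The geometric choice of thresholds, the function names, and the example distribution are all assumptions made for illustration; this is not the paper's algorithm, and it does not attempt to identify the "critical" scale.

```python
from typing import Dict, List


def truncated_mean(dist: Dict[float, float], threshold: float) -> float:
    """Return E[min(X, threshold)] for a discrete distribution {value: probability}."""
    return sum(p * min(x, threshold) for x, p in dist.items())


def scales(k: float) -> List[float]:
    """Thresholds k, k/2, k/4, ... down to 1 (an illustrative, assumed choice)."""
    out = []
    t = float(k)
    while t >= 1.0:
        out.append(t)
        t /= 2.0
    return out


# Hypothetical reward: 0 with probability 0.9, 10 with probability 0.1.
reward = {0.0: 0.9, 10.0: 0.1}

# Truncated expectation of the reward at each scale for a target of k = 8.
table = {t: truncated_mean(reward, t) for t in scales(8)}
```

Truncating at a threshold caps how much a single rare, large outcome can contribute to the expectation, which is why the truncated mean can behave better as a surrogate than the plain mean.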
Forward and Inverse Approximation Theory for Linear Temporal Convolutional Networks
We present a theoretical analysis of the approximation properties of
convolutional architectures when applied to the modeling of temporal sequences.
Specifically, we prove an approximation rate estimate (Jackson-type result) and
an inverse approximation theorem (Bernstein-type result), which together
provide a comprehensive characterization of the types of sequential
relationships that can be efficiently captured by a temporal convolutional
architecture. The rate estimate improves upon a previous result via the
introduction of a refined complexity measure, whereas the inverse approximation
theorem is new.
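For readers unfamiliar with the object being analyzed, the following is a minimal sketch of a linear temporal convolution, assuming a causal filter and zero padding before the start of the sequence. The function name, the `dilation` parameter, and the two-layer stacking are illustrative assumptions, not details taken from the paper.

```python
from typing import List


def causal_conv(x: List[float], w: List[float], dilation: int = 1) -> List[float]:
    """Compute y[t] = sum_j w[j] * x[t - j*dilation], treating x[s] as 0 for s < 0."""
    y = []
    for t in range(len(x)):
        acc = 0.0
        for j, wj in enumerate(w):
            s = t - j * dilation
            if s >= 0:
                acc += wj * x[s]
        y.append(acc)
    return y


# A unit impulse exposes the overall filter implemented by the stacked layers.
x = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
h1 = causal_conv(x, [1.0, 0.5], dilation=1)
h2 = causal_conv(h1, [1.0, 0.25], dilation=2)
```

Because every layer is linear, the stack as a whole is again a single linear filter on the input sequence; here two layers with two weights each produce an impulse response with four nonzero taps.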
Approximation theory of transformer networks for sequence modeling
The transformer is a widely applied architecture in sequence modeling
applications, but the theoretical understanding of its working principles is
limited. In this work, we investigate the ability of transformers to
approximate sequential relationships. We first prove a universal approximation
theorem for the transformer hypothesis space. From its derivation, we identify
a novel notion of regularity under which we can prove an explicit approximation
rate estimate. This estimate reveals key structural properties of the
transformer and suggests the types of sequence relationships that the
transformer is adapted to approximating. In particular, it allows us to
concretely discuss the structural bias between the transformer and classical
sequence modeling methods, such as recurrent neural networks. Our findings are
supported by numerical experiments.
Natural Graph Wavelet Packet Dictionaries
We introduce a set of novel multiscale basis transforms for signals on graphs
that utilize their "dual" domains by incorporating the "natural" distances
between graph Laplacian eigenvectors, rather than simply using the eigenvalue
ordering. These basis dictionaries can be seen as generalizations of the
classical Shannon wavelet packet dictionary to arbitrary graphs, and do not
rely on the frequency interpretation of Laplacian eigenvalues. We describe the
algorithms (involving either vector rotations or orthogonalizations) to
construct these basis dictionaries, use them to efficiently approximate graph
signals through the best basis search, and demonstrate the strengths of these
basis dictionaries for graph signals measured on sunflower graphs and street
networks.